Cross-lingual information retrieval systems
Author
Abstract
In this work, we explore different approaches used in Cross-Lingual Information Retrieval (CLIR) systems, focusing on systems that use statistical machine translation (SMT) to translate queries into the collection language. This includes using SMT systems as a black box or as a white box, as well as SMT systems tuned towards better CLIR performance. We then present our approach to reranking alternative translations using a machine learning regression model, and introduce the set of features we used to train the model. Next, we adapt this reranker to new languages. We also present our query expansion approach, which uses a word-embedding model trained on medical data. Finally, we reinvestigate translating the document collection into the query language and outline our future work.
Similar resources
Query Translation for Cross-lingual Information Retrieval using Wikipedia
This paper introduces WikiTranslate, a system that performs query translation for cross-lingual information retrieval (CLIR) using only Wikipedia. Queries are mapped to Wikipedia concepts, and the corresponding translations of these concepts in the target language are used to create the final query. WikiTranslate is evaluated by searching with topics in Dutch, French and Spanish i...
Modern Multilingual and Cross-lingual Information Access Technologies
In this chapter, we describe the state of the art cross-lingual and multilingual strategies and their related areas. In particular, we show a WWW-based information system called MIETTA, which allows uniform and multilingual access to heterogeneous data sources in the tourism domain. The design of the search engine is based on a new cross-lingual framework. The framework integrates a cross-lingu...
Using Information Extraction to Improve Cross-lingual Document Retrieval
We present a filtering mechanism using two cross-lingual information extraction (CLIE) systems for improving document relevance of cross-lingual information retrieval (CLIR) for queries conforming to predefined templates. Experiments on retrieving Chinese documents in response to English GALE arrest queries show that this approach can obtain a 12.7% absolute improvement in relevance (representi...
Leveraging User Interaction and Social Tagging for Improving Cross-lingual Information Access in Digital Libraries
Evaluation of interactive cross-lingual information retrieval systems has been the focus of recent research. The goal is to support the users in formulating effective queries and selecting the documents which satisfy their information needs regardless of the language of the documents. This dissertation aims at harnessing the user-system interaction, extracting the added value and integrating it...
Ricoh at CLEF 2004
Abstract. This paper describes the participation of RICOH in the monolingual and cross-lingual information retrieval tasks on German Indexing and Retrieval Testdatabase (GIRT) in the Cross-Language Evaluation Forum (CLEF) 2004. We used a morphological analyzer for word decompounding and parallel corpora for cross-lingual information retrieval. The performance of cross-lingual information retrie...